docs: remove docs code reference #674

Merged
johnnygreco merged 4 commits into main from andreatgretel/docs/remove-code-reference-docs
May 21, 2026

@andreatgretel
Contributor

@andreatgretel andreatgretel commented May 18, 2026

📋 Summary

Removes the generated code reference docs from both MkDocs and Fern so the docs no longer publish or link to the retired API reference surface. This also removes the generation plumbing and adds publish-branch cleanup for archived Fern versions so stale reference pages do not survive in docs-website archives.

🔗 Related Issue

N/A

🔄 Changes

🗑️ Removed

  • Deleted the MkDocs docs/code_reference/** pages, Fern fern/versions/latest/pages/code_reference/** pages, mkdocstrings CSS, and py2fern normalization script.
  • Removed code reference nav/config, dependency entries, Make targets, workflow env, and ignored Fern artifacts.
  • Removed stale reference links from MkDocs/Fern concept and plugin docs, plus contributor and agent docs.

🔧 Changed

🔍 Attention Areas

⚠️ Reviewers: Please pay special attention to the following:

  • fern/scripts/fern-published-branch.py - During publish sync, the affected concept/plugin pages in archived Fern versions are overwritten with their cleaned current versions, so stale reference links are removed from historical docs.

🧪 Testing

  • .venv/bin/ruff check --fix .
  • .venv/bin/ruff format .
  • make check-fern-docs passes with 0 errors and 2 existing warnings
  • .venv/bin/mkdocs build passes with existing docs warnings
  • git diff --check
  • Source keyword sweep for retired reference strings
  • docs-website dry-run sync plus make check-fern-docs
  • Claude review and follow-up found no actionable findings
  • make test passes (N/A - docs-only; not run)
  • Unit tests added/updated (N/A - no testable logic)
  • E2E tests added/updated (N/A - docs-only)
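
For readers unfamiliar with the "source keyword sweep" item above, here is a minimal sketch of what such a check could look like. The PR does not show the real sweep, so the function name `sweep` and the `RETIRED_STRINGS` tuple are hypothetical; the sketch only illustrates a literal substring search over a source tree.

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical needle list; the actual sweep used for this PR is not shown.
RETIRED_STRINGS = ("code_reference", "Code Reference")


def sweep(root: Path, needles: tuple[str, ...] = RETIRED_STRINGS) -> list[tuple[Path, int, str]]:
    """Return (file, line number, needle) for every literal hit under root."""
    hits: list[tuple[Path, int, str]] = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # skip files that are not UTF-8 text
        for lineno, line in enumerate(text.splitlines(), start=1):
            for needle in needles:
                if needle in line:
                    hits.append((path, lineno, needle))
    return hits
```

Because the match is a plain substring test, a string assembled at runtime from two separate literals does not register as a hit, while the same phrase written out in a docs page does.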

✅ Checklist

  • Follows commit message conventions
  • Commits are signed off (DCO)
  • Architecture docs updated (N/A - no architecture changes)

@andreatgretel andreatgretel marked this pull request as ready for review May 18, 2026 21:37
@andreatgretel andreatgretel requested a review from a team as a code owner May 18, 2026 21:37
@github-actions
Contributor

github-actions Bot commented May 18, 2026

MkDocs preview: https://8995553c.dd-docs-preview.pages.dev

Fern preview: https://nvidia-preview-pr-674.docs.buildwithfern.com/nemo/datadesigner

Fern previews include the docs-website version archive with PR changes synced into latest. Notebook tutorials are rendered without execution outputs in previews.

@github-actions
Contributor

PR #674 Review — docs: remove docs code reference

Summary

This is a docs-only PR (79 additions / 1690 deletions) that removes the
generated API reference surface from both publishing pipelines:

  • MkDocs: deletes docs/code_reference/**, drops mkdocstrings /
    mkdocstrings-python from pyproject.toml and uv.lock, removes
    the mkdocstrings CSS, and trims mkdocs.yml nav.
  • Fern: deletes fern/versions/latest/pages/code_reference/**, removes
    the Code Reference nav section from fern/versions/latest.yml, drops
    the libraries: block and all /code-reference/* redirect rules from
    fern/docs.yml, removes py2fern from deps, and deletes
    fern/scripts/normalize-py2fern-indexes.py.
  • Plumbing: removes the generate-fern-api-reference[-native] Make
    targets, the DOCS_PY2FERN workflow env, and the fern/code-reference/
    gitignore entry.
  • Concept/plugin pages: rewrites stale /code-reference/... links into
    short prose mentions (columns, custom_columns, model-configs,
    person_sampling, security, tool_use_and_mcp, validators,
    plugins/example, plugins/overview).
  • Agent docs: updates .agents/, CONTRIBUTING.md, fern/AGENTS.md,
    and fern/README.md so they no longer reference the retired surface.

The only behavioral change is in fern/scripts/fern-published-branch.py,
which now strips the Code Reference nav and code_reference page tree
from archived Fern versions during publish sync, and refreshes the
affected concept/plugin pages in those archives so the inline links
stripped on latest also disappear from historical docs.
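
The page-tree removal described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script's actual code: the function name `delete_retired_page_trees` is hypothetical, and the directory layout (`<version>/pages/code_reference` under a versions root) is inferred from the paths quoted in this review.

```python
from __future__ import annotations

import shutil
from pathlib import Path

# Directory name taken from the paths in this PR; treated here as an assumption.
RETIRED_REFERENCE_DIR = "code_reference"


def delete_retired_page_trees(versions_root: Path) -> list[Path]:
    """Delete every <version>/pages/code_reference directory; return what was removed."""
    removed: list[Path] = []
    for page_tree in sorted(versions_root.glob(f"*/pages/{RETIRED_REFERENCE_DIR}")):
        if page_tree.is_dir():
            shutil.rmtree(page_tree)
            removed.append(page_tree)
    return removed
```

The glob matches any version directory, so a version that never had the retired tree is simply skipped rather than treated as an error.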

Findings

Correctness

  • remove_retired_reference_archive flow looks sound. It runs after
    clear_published_tree + source copy + merge_preserved_versions, so
    the published tree at this point is: source latest (no
    code_reference) + preserved v* versions (which may still have
    code_reference). The script (a) strips the nav block from every
    v*.yml, (b) deletes any */pages/code_reference directory under
    versions/, and (c) overlays the cleaned-on-latest versions of the
    9 affected concept/plugin pages into each v*/pages/. That's a
    consistent end state.
  • remove_navigation_section shares the same end-of-block heuristic
    as extract_/replace_navigation_section (next line that
    startswith(" - ") and is non-empty). For a section that is last
    in the file, end falls through to len(lines), which is the desired
    behavior. ✅
  • glob("v*/pages") is intentionally narrow — it only matches
    version directories whose names start with v, matching the
    REDIRECT_VERSION_RE convention elsewhere in this file. If a future
    archive uses an older-versions/... shape, page refreshes there would
    be skipped silently. Not a regression for this PR; worth noting if
    archive naming ever broadens.
  • glob(f"*/pages/{RETIRED_REFERENCE_DIR}") would also match
    latest/pages/code_reference, but latest no longer contains that
    directory after the source copy, so the broader glob is harmless and
    keeps the cleanup robust against a stray re-add.
  • Redirect removal is a deliberate trade-off. All the
    /nemo/datadesigner/code_reference/* → /code-reference/* redirects
    in fern/docs.yml are deleted. Users following indexed search
    results to /code_reference/... will now get 404s instead of being
    redirected to the (nonexistent) new code-reference pages. Since the
    destination is also gone, redirecting wouldn't help — but you may want
    to consider a single redirect of the code_reference root to
    /concepts/columns or the API overview page. Not blocking; a product
    call.
  • fern-published-branch.py lines 18-19 use split string literals
    ("Code " + "Reference", "code" + "_reference"). This is a
    workaround for the "source keyword sweep for retired reference
    strings" check listed in the PR's testing checklist. It works, but
    it's the kind of cleverness that future maintainers will revert
    without realizing why. A one-line # noqa-style comment explaining
    the sweep would prevent that, e.g.
    # Split to satisfy the retired-reference keyword sweep; do not collapse.
    Optional.
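
To make the end-of-block heuristic discussed in this list concrete, here is a minimal sketch of a line-based section removal. It is not the script's implementation: the signature, the `- section: <title>` header shape, and the two-space `ITEM_PREFIX` indent for top-level navigation items are all assumptions made for illustration.

```python
from __future__ import annotations

ITEM_PREFIX = "  - "  # assumed indent of top-level navigation items


def remove_navigation_section(text: str, title: str) -> str:
    """Drop the top-level nav item `- section: <title>` and its nested lines."""
    lines = text.splitlines(keepends=True)
    header = f"{ITEM_PREFIX}section: {title}"
    start = next((i for i, line in enumerate(lines) if line.rstrip() == header), None)
    if start is None:
        return text  # section not present; leave the file untouched
    end = len(lines)  # default covers a section that is last in the file
    for i in range(start + 1, len(lines)):
        if lines[i].startswith(ITEM_PREFIX):
            end = i  # next top-level item closes the block
            break
    return "".join(lines[:start] + lines[end:])
```

Nested entries are indented further than `ITEM_PREFIX`, so they never terminate the block early; only the next sibling item or the end of the file does.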

Conventions

  • Matches the surrounding style of fern-published-branch.py:
    module-level constants, from __future__ import annotations, modern
    type annotations (list[str], re.Pattern[str]), no relative
    imports, PublishedBranchError for failures. ✅
  • Concept-page rewrites preserve voice and Markdown link conventions
    used elsewhere in fern/versions/latest/pages/concepts/.
  • pyproject.toml and uv.lock are kept in sync; transitive removal
    of astroid (mkdocstrings → griffe → astroid) is correctly reflected
    in the lockfile.
  • Makefile .PHONY list is kept in sync with the deleted targets.
  • Typo in the existing source line at
    fern/versions/latest/pages/concepts/person_sampling.mdx:43
    ("For mor details") is replaced rather than corrected — fine for this
    PR, but a free fix you could land alongside.

Performance

  • No runtime/library performance impact. Loss of the docs build step
    for the API reference will modestly speed up make check-fern-docs
    and the docs-preview workflow.

Test coverage

  • Docs-only; no logic tests are required for the deletions.
  • fern-published-branch.py has no unit tests in this repo (pre-existing
    state). The new remove_retired_reference_archive is therefore
    exercised only by the publish dry-run noted in the PR checklist
    ("docs-website dry-run sync plus make check-fern-docs"). Adding a
    small pytest around the YAML mutation helpers would be a low-cost
    follow-up but is out of scope here.
  • The PR explicitly verifies make check-fern-docs (0 errors, 2
    pre-existing warnings) and mkdocs build. Adequate for a docs PR.

Security

  • No secrets, no new network calls, no executable changes outside the
    publish-sync script. shutil.rmtree is constrained to paths inside
    published_root / "fern" / "versions" (the script's own temp
    workspace), so no risk of overreach.
  • No prompt-injection / external-content concerns.
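
The review credits the script's path construction for keeping shutil.rmtree inside the versions directory. If an explicit check were ever wanted, one possible shape is sketched below. The helper name `rmtree_within` is hypothetical and nothing in the PR indicates that such a guard exists; the sketch raises ValueError, whereas the script itself reportedly uses PublishedBranchError for failures.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def rmtree_within(root: Path, target: Path) -> None:
    """Remove target only if it resolves to a location strictly inside root."""
    root_resolved = root.resolve()
    target_resolved = target.resolve()
    # Refuse the root itself and anything that does not have root as an ancestor.
    if root_resolved not in target_resolved.parents:
        raise ValueError(f"refusing to delete {target} outside {root}")
    shutil.rmtree(target_resolved)
```

Resolving both paths first means symlinks and `..` segments are normalized before the ancestry check runs.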

Risks / things to double-check after merge

  1. Inbound links from external sources (Google, blog posts,
    internal NVIDIA wiki) pointing at /code_reference/... or
    /code-reference/... will 404. If telemetry shows non-trivial hits,
    consider a single catch-all redirect to a relevant concept page.
  2. docs-website archive cleanup runs only at next publish. Until
    then, archived versions on the live site still surface broken
    /code-reference/... links from their concept pages. The PR's
    approach (refresh from latest on publish) handles this on the
    next run; just be aware the gap is one publish cycle.
  3. fern/AGENTS.md still references code-reference/ in some
    commentary (worth a final grep before merging).

Verdict

Approve / non-blocking comments only. This is a clean, well-scoped
removal of a retired surface. The sole logic change in
fern-published-branch.py is straightforward and consistent with the
existing nav-mutation helpers. Two optional follow-ups: (a) add a
brief comment explaining the split string literals, and (b) consider a
single catch-all redirect to soften the 404 cliff for external
inbound links. Neither blocks merge.

@greptile-apps
Contributor

greptile-apps Bot commented May 18, 2026

Greptile Summary

Removes all code-reference documentation from MkDocs and Fern (pages, nav entries, generation tooling, and CSS), and updates the publish-branch sync script to strip retired reference sections and directories from historical archived versions.

  • Deletes 32 docs/code_reference/** and 20 fern/versions/latest/pages/code_reference/** pages, the normalize-py2fern-indexes.py script, MkDocs mkdocstrings CSS, and all Makefile targets and pyproject dependencies tied to the py2fern pipeline.
  • Updates fern/scripts/fern-published-branch.py to introduce remove_retired_reference_archive, which removes the "Code Reference" nav block, deletes the code_reference page trees, and overwrites stale concept/plugin pages in every archived version with their current link-cleaned equivalents.
  • Replaces direct code-reference links throughout concept and plugin docs with plain-text method names or redirects to tutorial pages.

Confidence Score: 5/5

A well-scoped docs-only cleanup; all deleted pages and nav entries are for the retired code reference surface, and the publish-script logic for stripping archived versions mirrors the pre-existing section-detection pattern.

All changes are documentation deletions or link cleanups. The only executable logic touched is the Fern publish script, where the new remove_retired_reference_archive function uses the same section-boundary detection algorithm that the predecessor sync_code_reference_archive already relied on. Ordering of operations (remove nav → delete pages → refresh concept/plugin pages → materialize) is correct, and the guard if source_file.exists() and target_file.exists() prevents creating pages that never existed in older archives.
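
The guard mentioned above can be illustrated with a short sketch. The function name `refresh_archived_pages` and its `relative_pages` parameter are hypothetical; only the `source_file.exists() and target_file.exists()` condition and the `v*/pages` glob are taken from the review text, and the `latest/pages` layout is assumed from the paths in this PR.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def refresh_archived_pages(versions_root: Path, relative_pages: list[str]) -> list[Path]:
    """Copy cleaned pages from latest/pages into each v*/pages where the page already exists."""
    latest_pages = versions_root / "latest" / "pages"
    refreshed: list[Path] = []
    for archive_pages in sorted(versions_root.glob("v*/pages")):
        for relative in relative_pages:
            source_file = latest_pages / relative
            target_file = archive_pages / relative
            # Only overwrite; never create a page an older archive did not have.
            if source_file.exists() and target_file.exists():
                shutil.copyfile(source_file, target_file)
                refreshed.append(target_file)
    return refreshed
```

Requiring both files to exist keeps older archives from gaining pages that were introduced after that version was cut.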

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| fern/scripts/fern-published-branch.py | Replaces sync_code_reference_archive with remove_retired_reference_archive; new remove_navigation_section helper correctly identifies section boundaries using the same start/end detection logic as the pre-existing extract_navigation_section. Obfuscated string constants are intentional to prevent self-matching in keyword sweeps. |
| fern/versions/latest.yml | Removes the entire 'Code Reference' nav block (70 lines) from the latest Fern version navigation; remaining nav structure is intact. |
| fern/docs.yml | Removes the libraries/data-designer-config Fern library input block and all code_reference redirect rules; comment on mkdocstrings also cleaned up. |
| Makefile | Removes generate-fern-api-reference targets and py2fern variables; prepare-fern-docs dependency simplified; .PHONY list updated accordingly. |
| .github/workflows/docs-preview.yml | Removes DOCS_PY2FERN variable from the check-fern-docs make invocation; no other logic changes. |
| mkdocs.yml | Removes code_reference nav section and mkdocstrings plugin/CSS config from MkDocs; existing nav structure is otherwise unaffected. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[sync_source called] --> B[Preserve archived versions to tmpdir]
    B --> C[Clear published tree]
    C --> D[Copy source repo into published root]
    D --> E[merge_preserved_versions]
    E --> F[remove_retired_reference_archive]
    F --> F1[Remove 'Code Reference' nav block\nfrom each archived v*.yml]
    F --> F2[Delete */pages/code_reference dirs\nin archived versions]
    F --> F3[Overwrite concept/plugin pages\nin archived v* with clean latest versions\nonly if target already exists]
    F1 & F2 & F3 --> G[materialize_version_nav_pages]
    G --> H[restore_versions_block]
    H --> I[validate_redirect_targets]
    I --> J[write_publish_metadata]

Reviews (4): Last reviewed commit: "Merge branch 'main' into andreatgretel/d..."

@johnnygreco
Contributor

Thanks for putting this together, @andreatgretel!

Summary

This removes the MkDocs/Fern generated code reference surface, its generation plumbing, dependency entries, nav, redirects, and published-archive cleanup path. The implementation matches the PR description: source-tree sweeps are clean for the old paths/tools, and a dry-run of fern-published-branch.py sync-source against the current docs-website archive removed the retired archive nav/pages cleanly.

Findings

Warnings — Worth addressing

.agents/agents/docs-searcher.md:66 — Last generated-reference breadcrumb remains

  • What: The docs search agent now says to "Prioritize user guides and examples over generated reference material when both exist." Since this PR removes the generated reference material entirely, this leaves a conceptual reference to the retired surface.
  • Why: It is not a broken public docs link, but it weakens the clean-removal story for agent-facing docs and can send future agents looking for a docs category that no longer exists.
  • Suggestion: Remove this bullet, or rephrase it around the docs that actually remain, e.g. Prioritize user guides, concepts, tutorials, and recipes according to the user's task.

What Looks Good

  • The main removal is broad and tidy: deleted pages, nav entries, Make targets, workflow env, generated-artifact ignore rules, docs dependencies, and lockfile entries are all covered.
  • The Fern published-branch cleanup is doing the important archive work: in a local dry-run against docs-website, the stale versioned code_reference directories and nav sections were removed, and the affected concept/plugin pages were refreshed.
  • The public docs link cleanup is consistent across MkDocs and Fern mirrors; the remaining code reference hits I found are generic code-symbol audit wording, not links to the removed docs section.

Verdict

Needs changes: please remove or reword the remaining generated-reference breadcrumb in .agents/agents/docs-searcher.md.


This review was generated by an AI assistant.

@andreatgretel
Contributor Author

thanks for the careful review! fixed that last breadcrumb by rewording it around guides, concepts, tutorials, and recipes. pushed in 20555a7.

@johnnygreco johnnygreco merged commit b6de38d into main May 21, 2026
51 checks passed